Inference Document Type (DTD) From XML Document: Web Structure Mining
Abstract
XML is becoming a prevalent format and the de facto standard for data exchange in many applications, yet large amounts of data are traditionally stored and managed in relational databases. Efficient methods are therefore urgently needed to convert data stored in relational databases to XML when integrating and exchanging it. The semantics of XML schemas are crucial for designing, querying, and storing XML documents, and functional dependencies are an important representation of the semantic information in XML schemas. Because DTDs are among the most frequently used schemas for XML documents today, we use DTDs as the schemas of XML documents here. This paper studies the problem of converting relational schemas to XML DTDs. Because functional dependencies play an important role in schema conversion, the concept of functional dependency for XML DTDs is used to preserve the semantics implied by the functional dependencies and keys of relational schemas. A conversion method is proposed that converts relational schemas to XML DTDs in the presence of functional dependencies, keys, and foreign keys. The method preserves the semantics implied by the functional dependencies, keys, and foreign keys of relational schemas and can convert multiple relational tables to XML DTDs at the same time.
Similar resources
Rule Learning from Semi-structured Documents by Inductive Logic Programming
One of the hot research areas is knowledge discovery on structured documents such as HTML and XML documents. For XML documents, the most popular approach to mining knowledge is the structural approach, which finds similar patterns (often tree structures or XPath expressions) in the XML documents of interest. On the other hand, there is the relational data mining approach, such as ILP (Inductive Logic Progra...
Structuring Domain-Specific Text Archives by Deriving a Probabilistic XML DTD
Domain-specific documents often share an inherent, though undocumented structure. This structure should be made explicit to facilitate efficient, structure-based search in archives as well as information integration. Inferring a semantically structured XML DTD for an archive and subsequently transforming its texts into XML documents is a promising method to reach these objectives. Based on the ...
Extraction of Semantic XML DTDs from Texts Using Data Mining Techniques
Although composed of unstructured texts, documents contained in textual archives such as public announcements, patient records and annual reports to shareholders often share an inherent though undocumented structure. In order to facilitate efficient, structure-based search in archives and to enable information integration of text collections with related data sources, this inherent structure sh...
Partitions musicales et technologies web
This paper shows that new web technologies such as SVG, DOM, AJAX and CSS are now mature enough to allow browsing of musical scores with optimal graphical and ergonomic quality, together with powerful standard XML data-mining tools. KEYWORDS: AJAX, DOM, DTD, CSS, musical scores, MusicXML, SAX, SVG, web.
XML Query Processing Using Signature and DTD
Having emerged as a standard web language, XML has become the core of e-business solutions. XML is semistructured data represented as a graph, which distinguishes it from the data handled by existing databases. A query is represented as a regular path expression, which is evaluated by traversing each node of the graph. In an XML document with a DTD, the DTD may be able to prov...